SCS Honors Senior Thesis 2022

SCS Honors Senior Thesis 2022 (Pittsburgh)

Take a look at the senior honors thesis research done by our undergraduates this semester. Click on the title link to view a pdf of the associated poster (if available). If you have questions about a student's work, feel free to email them using the indicated andrew ID (andrewID@andrew.cmu.edu).

Also feel free to attend their presentations on Wednesday, May 4.
GHC 4401, Rashid Audtorium
Zoom link available (see email)
All times are Eastern Daylight Time.

10:00AM	Fern Limprayoon	Path Guiding for People with Visual Impairments for Short-Range Rendezvous with a Static Target
10:20AM	Akhil Nadigatla	Towards Efficient Neural Zeroth-Order Optimization Algorithms
10:40AM	Daniel Schaffer	A Comparative Genomics Approach to Identifying Candidate Enhancers Associated with Phenotypes
11:00AM	Alan Lai	A Computational Analysis on the Hierarchical Structure of Animal Behavior
11:20AM	Ryan Liu	Identifying Human Biases in Conference Peer Review via Real-Subject Experiments
11:40AM	David Park	Cross-Modality Supervised Learning for 3D Object Detection
12:00PM	Gustavo Silvera	WORK-IN-PROGRESS: Improving Driving Assistance With Gaze Detection
12:10PM	Olivia Xu	WORK-IN-PROGRESS: How Firms Engage External Developer Communities to Co-create Software
12:20PM	LUNCH BREAK
1:00PM	Steven Lu	Efficiently Estimating Neighborhood Size and Distances in Large Undirected Graphs
1:20PM	Maggie Zhang	The Use of Vocal Fold Dynamics for Automatic Detection of ALS Through Voice
1:40PM	Gabriel Rasskin	Self-Supervised Multimodal Representation Learning
2:00PM	Aditi Gupta	Ergometric Multilinear Futures: A Foundation for Efficient Parallelism
2:20PM	Simran Kaur	On the Dubious Relationship Between Flatness and Generalization
2:40PM	Harrison Grodin	Compiler Flags as Types

ABSTRACTS (alphabetical author by presenter)

Harrison Grodin, hgrodin
Research Advisor: Robert Harper
Compiler Flags as Types
Current programming languages often have a clear semantics for their core language; however, they typically relegate usability issues to ad-hoc features enabled by compiler flags. In this work, we develop a mathematical basis for these flags using the topos-theoretic generalization of phase distinctions given by Sterling and Harper, making flags a first-class language feature. Each flag has a corresponding type that can introduce or alter features of the programming language; by taking in flags as function inputs, programmers can make flag requirements on a program explicit. In this way, we consolidate a wide variety of common language features and extensions; for example, we reconcile abstraction via ML-style modules, composability via effect preconditions, debugging via logging and testing frameworks, and portability via system and version requirements.

Aditi Gupta, aditig
Research Advisor: Frank Pfenning
Ergometric Multilinear Futures: A Foundation for Efficient Parallelism
In functional languages, it can be challenging to implement parallelism through futures efficiently. Two significant bottlenecks are memory management and granularity control in scheduling of parallel tasks. In this paper, we provide the theoretical basis for addressing both of these challenges. Our language is based on a mixed linear/nonlinear language, formulated in a semi-axiomatic sequent calculus so that its natural computational interpretation encompasses futures. Linearity allows eager deallocation, easing garbage collection. We identify multilinear types that may be shared among threads while retaining the advantages of linearity; parallel garbage collection on multilinear data uses simple reference counting. Finally, we augment this language with ergometric types to capture the total work of program execution under a flexible model of amortized cost based on potentials. This allows for informed scheduling decisions and partially-automated granularity control in both purely linear and multilinear settings. The potential inherent in a multilinear data structure is statically known and split among the threads accessing it. We prove type safety, which includes accuracy of reference counts and accounting of potential, and illustrate our language with small examples.

Simran Kaur, skaur
Research Advisor: Zachary Lipton
On the Dubious Relationship Between Flatness and Generalization
Certain training interventions, such as setting larger learning rates or applying batch normalization, improve the generalization performance of neural networks. Yet, why these techniques lead to better generalization remains a mystery. Some researchers have posited that the flatness of a learned solution influences its generalization to unseen data, with flatter solutions generalizing better. Motivated by this intuition, researchers have proposed a number of metrics for measuring flatness and a number of algorithms, e.g., such as Sharpness-Aware Minimization (SAM) [Foret et al., 2020], that directly optimize for flatness in the hopes of improving generalization, And yet researchers have long known that one can arbitrarily increase the sharpness, as quantified by the maximum eigenvalue of the Hessian of the loss evaluated at a given solution without changing the generalization error [Dinh et al., 2017], raising doubts about how closely flatness and generalization are linked. In this paper, we add the growing body of work that calls into question the influence of flatness on generalization. We show that: (1) for all batch sizes, large learning rates reliably produce flatter minima; however, at large batch sizes, these flat minima seldom generalize better than the sharp minima found at small learning rates; (2) for all batch sizes, SAM produces flatter solutions; however, its generalization benefits (also) diminish as batch size increases, vanishing entirely in the full-batch setting; (3) while dropout promotes flatness, excessively high dropout probabilities degrade generalization; and (4) while batch-normalization does not promote flatter solutions in the large-batch setting, it does confer generalization benefits. While our experiments affirm the generalization benefits of large learning rates and SAM for minibatch SGD, the GD-SGD discrepancy makes clear that sharpness alone cannot account for their efficacy. Similarly, while our experiments confirm the benefits of batch-normalization, the comparably flat solutions found with and without batch normalization further support the finding that sharpness does not provide a complete account of generalization in deep neural networks.

Alan Lai, alanlai
Research Advisors: Eric Yttri and Leila Wehbe
A Computational Analysis on the Hierarchical Structure of Animal Behavior
The neural mechanisms of naturalistic behavior have remained poorly studied due to the complexity of such analysis. Traditionally, behavioral analysis involves the study of highly reductionist and over-trained behaviors where an agent is artificially limited to a specific set of possible actions to perform. While there has been a great deal of work establishing the role of certain neural systems in behavior through such experimental setups it is difficult to make these same inferences about behavior in situations like our daily lives where there does not exist precise experimental control. This work attempts to better elucidate the roles of neural circuitry critical for decision-making by first establishing a rigorous means to study behavior. Using B-SOiD, an open-source, unsupervised learning algorithm to identify naturalistic behaviors, we study behavioral structure through a variety of computational methods, including the use of a Markov model, language models, and grammatical inference.

Fern Limprayoon, jlimpray
Research Advisor: Aaron Steinfeld
Path Guiding for People with Visual Impairments for Short-Range Rendezvous with a Static Target
Navigating to a static object in complex indoor spaces like train stations, shopping malls, and university buildings can be challenging for people with visual impairments since current path planning approaches rely on users to spontaneously adjust their trajectories based on their visual knowledge when rendezvousing with them. To help a user with a visual impairment rendezvous with a static object (e.g., a waiting robot, a kiosk, an autonomous car at the curbside, etc.) in a seamless way, we propose an approach that can incorporate information about the user’s position and orientation from sensors attached to the environment, nearby robots, or cars and dynamically plan and give users instructions of how to reach their goal.

Ryan Liu, ryanliu
Research Advisor: Nihar Shah
Identifying Human Biases in Conference Peer Review via Real-Subject Experiments
Computer science and its neighboring fields have relied on conference peer review to distinguish impactful, high quality research thanks to its high scalablility and efficiency. However, a concern with our current system is the large library of cognitive biases and human factors that may affect its desired objectivity. In this presentation, we focus on two select biases in the peer review process, one of which is citation bias, where a reviewer that has their own work cited in a submission is consequently more lenient towards the paper. We describe experiments designed to tease apart confounding factors and test for the presence of these biases, and present analyses to answer important questions about the prevalence and effect of said factors in the peer review setting.

Steven Lu, sslu1
Research Advisor: Guy Blelloch
Efficiently Estimating Neighborhood Size and Distances in Large Undirected Graphs
Querying certain properties in large graphs can be very expensive. Thus, rather than exact queries, estimations are often used instead. Our work focuses on least elements lists (le-lists), which can be used to estimate properties like neighborhood size and distance between two vertices. In particular, our research focuses on computing these le-lists in parallel. The greatest amount of parallelism is gained from undirected graphs with small diameter. Thus, we focus on graphs with such properties, in particular social media graphs representing relationships between users, as these often have these exact properties. We perform performance benchmarking to determine the benefit of parallelism on such graphs, as well as testing how certain applications of le-lists like distance or neighborhood size estimation perform on such graphs. Additionally, we present a new “compressed” version of le-lists which reduces the space required to store them, while retaining all the information provided. This, along with the parallelism, allows us to experiment on larger graphs than previous work on neighborhood size and distance estimation.

Akhil Nadigatla, anadigat
Research Advisor: Pradeep Ravikumar
Towards Efficient Neural Zeroth-Order Optimization Algorithms
In zeroth-order optimization (ZOO), our goal is to minimize an unknown function given noisy zeroth-order oracle access to it; that is, the function is only accessible via noisy point evaluations within some domain. ZOO has a wide range of applications in computer science, machine learning, and other engineering disciplines. For example, in machine learning ZOO techniques are often used for hyperparameter tuning. While ZOO has gained substantial attention from research communities over the years, existing techniques exhibit one or more of the following drawbacks: (1) they make restrictive structural assumptions on the objective function that rarely hold in practice, (2) they are computationally expensive and do not scale well to high-dimensional problems, and (3) they make too many queries to the zeroth-order oracle. Towards this end, we propose an iterated version of Neural Thompson Sampling which makes use of Stochastic Gradient Langevin Dynamics (SGLD) in choosing the neural network to be fit. We also provide a Neural Upper Confidence Bound-based algorithm that does not make any strict assumptions nor utilizes kernels, unlike others proposed in the literature. In addition, we assess the performance of a greedy technique, and explain other interesting phenomena such as the impact of parameter variance on smoothness of function fit.

David Park, jinhyun1
Research Advisor: Kris Kitani
Cross-Modality Supervised Learning for 3D Object Detection
Point clouds, RGB images, and radar signals are mutually complementary modalities for 3D understanding. Point clouds provide accurate but sparse 3D locations of points on objects, RGB images contain dense color and texture information, and radar, despite its high sparsity, works well under adverse sensor scenarios. In spite of this potential for cooperative sensor fusion, existing works prefer to treat point clouds as the primary modality, extracting features from RGB or radar solely to enhance point semantics. This single-direction fusion prevents improved point features from being used to strengthen RGB and radar features and task predictions, potentially resulting in sub-optimal performance. In this thesis, we propose to fully leverage this mutually beneficial relationship between modalities through recursive, cascaded modality feature fusion, cross-modality filtering for semi-supervised learning, and multi-modality consistency enforcement. Our experiments show that each type of fusion significantly improves perception performance.

Gabriel Rasskin, grasskin
Research Advisor: Louis-Philippe (LP) Morency
Self-Supervised Multimodal Representation Learning
Most real-world data is inherently multimodal and unlabeled: spoken language and its written transcription, visual information and the objects that can be seen, depth and color information. It is an open question how best to learn representations that are later useful for downstream tasks. In this work we attempt to improve and understand current self-supervised multimodal representation learning quantitatively and qualitatively. We propose a novel architecture that incorporates multiple modalities into its learning pipeline.

Daniel Schaffer, dschaffe
Research Advisor: Andreas Pfenning
A Comparative Genomics Approach to Identifying Candidate Enhancers Associated with Phenotypes
A large diversity of traits exists within the mammals. Many observed phenotypic differences are likely due to differences in the activity of enhancers, short regions of the genome involved in regulating the expression of nearby genes. Studying enhancer evolution is difficult because enhancer activity tends to be cell type-specific, and we cannot obtain most tissues and cell types from most species. We can obtain open chromatin regions (OCRs), which are a proxy for enhancers, in brain regions and cell types of a few species and use them to train machine learning models for predicting open chromatin. We can then identify orthologs of those OCRs in a large, diverse set of mammalian genomes recently generated by the Zoonomia Project and predict whether those orthologs have open chromatin. I used the predicted open chromatin to develop a method for associating OCRs with specific phenotypes. Applying this method for motor cortex and parvalbumin-positive neurons to neurological phenotypes revealed dozens of new OCR-phenotype associations. Many associated OCRs were close, both in the linear genome and in the 3D conformation of chromatin, to relevant genes, including brain size-associated OCRs near genes mutated in microcephaly or macrocephaly.

Gustavo Silvera. gsilvera
Research Advisor: Henny Admoni
WORK-IN-PROGRESS: Improving Driving Assistance With Gaze Detection

Olivia Xu, okx
Research Advisor: Chinmay Kulkarni
WORK-IN-PROGRESS: How Firms Engage External Developer Communities to Co-create Software

Maggie Zhang, jiayizha
Research Advisor: Rita Singh
The Use of Vocal Fold Dynamics for Automatic Detection of ALS Through Voice
Amyotrophic Lateral Sclerosis (ALS) is a progressive neurodegenerative disease. Current diagnostic methods for ALS are complicated and rely on subjective judgements from physicians. This situation motivates the development of an expedient and objective diagnostic aid. Since ALS affects motor neurons and causes dysfunctions in speech and respiration, we hypothesized that analyses of features that capture the essential characteristics of the biomechanical process of voice production can successfully distinguish ALS patients from non-ALS controls. We focus on representing voices with algorithmically estimated vocal fold dynamics from physical models of phonation and aim to validate our hypothesis by identifying a set of features that are effective for our desired separation. To achieve our goal, we have explored 2 main sets of features: simple statistical measurements (Set 1) and phase-space characterizations (Set 2) of estimated vocal fold displacements and of range of displacements. Random Forest Classifiers based on these sets of features were used to differentiate voices of ALS and non-ALS individuals. In 10-way cross-validation experiments, said classifiers, based on Set 1 and 2, yielded an average AUC-ROC of 99.6% and 82.3%, respectively. These results demonstrate the potential of the use of vocal fold dynamics in detecting ALS from voice recordings.